The example code shown here gives you hands-on experience in downloading, analysing and visualising a dataset archived on TERN EcoPlots (https://ecoplots-test.tern.org.au/). EcoPlots brings together plot-based ecology data from different sources, enabling integrated search and access to data by jurisdiction, data source, feature type, parameter and temporal extent.
Start by setting a working directory in which to store your files and R scripts.
#setwd("C:/Users/uqasin21") # Note that your path will be different
If you do not have the following packages installed, you can install
them using the function install.packages, with the name of
the package as an argument in quotation marks.
library(tidyverse)
library(httr)
library(ggpubr)
At this point, you should have your EcoPlots search query from the EcoPlots API Query dashboard (https://ecoplots-test.tern.org.au/discovery).
Important: make sure to download your API key from the website (https://account-test.tern.org.au/authenticated_user/apikeys). Once you have your API key, you can start using the query request.
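The query request itself is not reproduced in this tutorial. As a rough sketch only, a request made with httr could be wrapped as below. The helper name `fetch_ecoplots_csv`, the `X-Api-Key` header name, the environment variable name and the placeholder URL and body are all assumptions for illustration; use the exact values given by the code snippet from your own EcoPlots query.

```r
library(httr)

# ASSUMPTIONS: `fetch_ecoplots_csv` is a hypothetical helper. The request URL,
# the "X-Api-Key" header name and the JSON body are placeholders; copy the real
# values from the snippet generated for your EcoPlots query.
fetch_ecoplots_csv <- function(url, api_key, query_json) {
  res <- POST(
    url,
    add_headers(`X-Api-Key` = api_key, `Content-Type` = "application/json"),
    body = query_json  # a character body is sent as is
  )
  stop_for_status(res)  # stop with an error on an HTTP 4xx/5xx response
  res
}

# Usage (not run): the result plays the role of `res` in the next step.
# res <- fetch_ecoplots_csv(
#   url        = "<request URL from your EcoPlots query>",
#   api_key    = Sys.getenv("ECOPLOTS_API_KEY"),  # hypothetical variable name
#   query_json = "<JSON query body from your EcoPlots query>"
# )
```

Reading the key from an environment variable rather than typing it into the script keeps it out of files you may later share.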
The first step is to create a data frame, then explore and organise it to suit your research needs. In this example file, we have occurrences of vertebrates from the Australian Wet Tropics Bioregion, compiled by Williams (2006).
# create a data frame from the API response
# (`res` is the httr response object returned by your query request)
df <- read.csv(text = content(res, "text", encoding = "UTF-8"), stringsAsFactors = FALSE)
# data exploration and wrangling
head(df)
tail(df)
As you can see, there are over 13,000 observations providing the coordinates of each animal occurrence, the scientific name and the conservation status.
We need to store the scientific name and the regional government conservation status as factors.
# convert the character columns to factors
df$scientificName <- as.factor(df$scientificName)
df$regionalGovermentConservationStatus <- as.factor(df$regionalGovermentConservationStatus)
summary(df$regionalGovermentConservationStatus)
##    Endangered wildlife Least concern wildlife                    N/A
##                    554                   5539                   7347
##  Special least concern
##                    233
From this we see that there are four levels of conservation status (Endangered wildlife, Least concern wildlife, Special least concern and N/A).
Let us visualise them and plot them on a map to see how they are distributed within the bioregion.
# bar plot of the number of records per conservation status
plot(df$regionalGovermentConservationStatus)
# mapping
library(ozmaps)
library(sf)
aus <- ozmap_data(data = "states") # create a map layer with state boundaries
# map based on conservation status
m1 <- ggplot() +
  geom_sf(data = aus, fill = "#FBFBEF") +
  geom_point(
    data = df,
    mapping = aes(
      x = longitude_Degree,
      y = latitude_Degree,
      colour = regionalGovermentConservationStatus),
    alpha = 0.5) +
  coord_sf(ylim = c(-22, -11), xlim = c(141, 151)) + # limits given as (min, max)
  labs(x = "", y = "", colour = "Conservation Status") +
  theme_bw()
# linewidth replaces size for lines and borders in ggplot2 >= 3.4.0
m1 + scale_color_manual(values = c("red", "steelblue", "grey2", "#CE5832")) +
  theme(axis.text.y = element_text(colour = "black", size = 14, face = "bold"),
        axis.text.x = element_text(colour = "black", face = "bold", size = 14),
        legend.text = element_text(size = 12, face = "bold", colour = "black"),
        legend.position = "top",
        axis.title.y = element_text(face = "bold", size = 14),
        axis.title.x = element_text(face = "bold", size = 14, colour = "black"),
        legend.title = element_text(size = 14, colour = "black", face = "bold"),
        panel.background = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA, linewidth = 1.2),
        legend.key = element_blank())
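The same long theme() call is repeated for every figure in this tutorial. One possible way to avoid the repetition is to wrap the settings in a small helper. This is a minimal sketch: `theme_tutorial` is a hypothetical name (not part of ggplot2), the built-in mtcars data is used only so the example runs on its own, and `linewidth` assumes ggplot2 3.4.0 or later.

```r
library(ggplot2)

# `theme_tutorial` is a hypothetical helper name, not a ggplot2 function.
# It bundles the text, legend and border settings repeated across the plots.
theme_tutorial <- function(base_size = 12, legend_position = "top") {
  bold_black <- element_text(colour = "black", face = "bold", size = base_size)
  theme(
    axis.text = bold_black,    # inherited by axis.text.x and axis.text.y
    axis.title = bold_black,   # inherited by axis.title.x and axis.title.y
    legend.text = element_text(colour = "black", face = "bold", size = 12),
    legend.title = bold_black,
    legend.position = legend_position,
    legend.key = element_blank(),
    panel.background = element_blank(),
    panel.border = element_rect(colour = "black", fill = NA, linewidth = 1.2)
  )
}

# Usage on a small built-in dataset
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  theme_bw() +
  theme_tutorial(base_size = 14)
p
```

With such a helper, a map could be styled with `theme_tutorial(14)` and a bar chart without a legend with `theme_tutorial(12, legend_position = "none")`.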
As you can see from the map, there are many records with the conservation status ‘Least concern wildlife’, and just over 550 records of ‘Endangered wildlife’. A large proportion of the data (7,347 of 13,673 records) has no conservation status (listed as N/A).
Let us create a new data frame with the number of occurrences per species and conservation status, and visualise the distribution of these counts.
df2 <- df %>%
  group_by(scientificName, regionalGovermentConservationStatus) %>%
  summarise(no_rows = n(), .groups = "drop") # number of records per species and status
hist(df2$no_rows)
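To make the counting step easier to follow, here is a minimal sketch on made-up data using dplyr's count(), which groups and tallies in one call. The data frame `toy` and its values are invented for illustration and are not taken from EcoPlots.

```r
library(dplyr)

# Toy data (made up for illustration): one row per occurrence record
toy <- data.frame(
  scientificName = c("Species a", "Species a", "Species b", "Species a", "Species c"),
  status = c("Least concern", "Least concern", "N/A", "Least concern", "Endangered")
)

# One row per species/status combination, with the number of records in `no_rows`
toy_counts <- toy %>% count(scientificName, status, name = "no_rows")
toy_counts
```

Each row of the result corresponds to one bar in the species-level plots that follow.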
Here it appears that the majority of species in the dataset have fewer than 100 records. Let us create two subsets, for example one with counts > 200 and one with counts < 20.
# species with high occurrence counts
df3 <- df2 %>% filter(no_rows > 200)
h1 <- ggplot(df3, aes(x = no_rows, y = scientificName)) +
  geom_bar(stat = "identity", fill = "#a84830") +
  theme(axis.text.y = element_text(angle = 10, hjust = 1)) +
  labs(y = "Scientific name", x = "count")
h1 + theme(axis.text.y = element_text(colour = "black", size = 12, face = "bold"),
           axis.text.x = element_text(colour = "black", face = "bold", size = 12),
           legend.text = element_text(size = 12, face = "bold", colour = "black"),
           legend.position = "none",
           axis.title.y = element_text(face = "bold", size = 12),
           axis.title.x = element_text(face = "bold", size = 12, colour = "black"),
           legend.title = element_text(size = 12, colour = "black", face = "bold"),
           panel.background = element_blank(),
           panel.border = element_rect(colour = "black", fill = NA, linewidth = 1.2),
           legend.key = element_blank())
# let us fill this with the conservation status
h2 <- ggplot(df3, aes(x = no_rows, y = scientificName, fill = regionalGovermentConservationStatus)) +
  geom_bar(stat = "identity") +
  theme(axis.text.y = element_text(angle = 10, hjust = 1)) +
  labs(y = "Scientific name", x = "count", fill = "")
h2 + scale_fill_manual(values = c("red", "steelblue", "grey2", "#CE5832")) +
  theme(axis.text.y = element_text(colour = "black", size = 12, face = "bold"),
        axis.text.x = element_text(colour = "black", face = "bold", size = 12),
        legend.text = element_text(size = 12, face = "bold", colour = "black"),
        legend.position = "top",
        axis.title.y = element_text(face = "bold", size = 12),
        axis.title.x = element_text(face = "bold", size = 12, colour = "black"),
        legend.title = element_text(size = 12, colour = "black", face = "bold"),
        panel.background = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA, linewidth = 1.2),
        legend.key = element_blank())
We observe that there are not many endangered species (other than Casuarius casuarius johnsonii) in this smaller dataset of species with counts > 200.
Let us see how many species are present with really low counts.
# species with low occurrence counts
lowcounts <- df2 %>% filter(no_rows < 20)
l1 <- ggplot(lowcounts, aes(x = no_rows, y = scientificName)) +
  geom_bar(stat = "identity", fill = "#a84830") +
  theme(axis.text.y = element_text(angle = 10, hjust = 1)) +
  labs(y = "Scientific name", x = "count")
l1 + theme(axis.text.y = element_text(colour = "black", size = 12, face = "bold"),
           axis.text.x = element_text(colour = "black", face = "bold", size = 12),
           legend.text = element_text(size = 12, face = "bold", colour = "black"),
           legend.position = "none",
           axis.title.y = element_text(face = "bold", size = 12),
           axis.title.x = element_text(face = "bold", size = 12, colour = "black"),
           legend.title = element_text(size = 12, colour = "black", face = "bold"),
           panel.background = element_blank(),
           panel.border = element_rect(colour = "black", fill = NA, linewidth = 1.2),
           legend.key = element_blank())
# let us fill this with the conservation status
l2 <- ggplot(lowcounts, aes(x = no_rows, y = scientificName, fill = regionalGovermentConservationStatus)) +
  geom_bar(stat = "identity") +
  theme(axis.text.y = element_text(angle = 10, hjust = 1)) +
  labs(y = "Scientific name", x = "count", fill = "")
l2 + scale_fill_manual(values = c("red", "steelblue", "grey2", "#CE5832")) +
  theme(axis.text.y = element_text(colour = "black", size = 12, face = "bold"),
        axis.text.x = element_text(colour = "black", face = "bold", size = 12),
        legend.text = element_text(size = 12, face = "bold", colour = "black"),
        legend.position = "top",
        axis.title.y = element_text(face = "bold", size = 12),
        axis.title.x = element_text(face = "bold", size = 12, colour = "black"),
        legend.title = element_text(size = 12, colour = "black", face = "bold"),
        panel.background = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA, linewidth = 1.2),
        legend.key = element_blank())
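As you can see, most of the species with low counts are ‘Least concern wildlife’.
Let us now create a map of species occurrence with the counts grouped into three classes: ‘Low’ (0-249 counts), ‘Medium’ (250-500 counts) and ‘High’ (>500 counts). Note that cut() with 3 as its second argument splits the observed range of counts into three equal-width intervals, so these class limits are approximate.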
# make a data frame with the per-species record count attached to every occurrence
df5 <- df %>%
  group_by(scientificName, regionalGovermentConservationStatus) %>%
  reframe(count = n(),
          latitude = latitude_Degree,
          longitude = longitude_Degree)
# create a factor that bins the counts into three labelled classes
df5$occurrence <- cut(df5$count, 3, labels = c("Low", "Medium", "High"))
table(df5$occurrence)
##
##    Low Medium   High
##   3983   4735   4955
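Because cut(x, 3) derives equal-width breaks from the data, the class limits depend on the observed range of counts. The sketch below, using made-up counts, contrasts that behaviour with explicit breaks that pin the classes to fixed limits (0-249, 250-500, >500). The vector `counts` is invented for illustration only.

```r
# Toy counts (made up for illustration)
counts <- c(5, 249, 250, 500, 501)

# Equal-width bins over the observed range, as produced by cut(x, 3)
equal_width <- cut(counts, 3, labels = c("Low", "Medium", "High"))

# Explicit breaks: (0, 249], (249, 500], (500, Inf]
explicit <- cut(counts,
                breaks = c(0, 249, 500, Inf),
                labels = c("Low", "Medium", "High"))

# Compare the two classifications side by side
data.frame(counts, equal_width, explicit)
```

With these toy values the two approaches disagree for 249 and 500, which shows why explicit breaks are the safer choice when the class limits need to match fixed values.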
# map based on occurrence class
m2 <- ggplot() +
  geom_sf(data = aus, fill = "#FBFBEF") +
  geom_point(
    data = df5,
    mapping = aes(
      x = longitude,
      y = latitude,
      colour = occurrence),
    alpha = 0.5) +
  coord_sf(ylim = c(-22, -11), xlim = c(141, 151)) + # limits given as (min, max)
  labs(x = "", y = "", colour = "counts") +
  theme_bw()
# the points use the colour aesthetic, so a colour scale (not a fill scale) is needed
m2 + scale_color_manual(values = c("red", "steelblue", "grey2")) +
  theme(axis.text.y = element_text(colour = "black", size = 14, face = "bold"),
        axis.text.x = element_text(colour = "black", face = "bold", size = 14),
        legend.text = element_text(size = 12, face = "bold", colour = "black"),
        legend.position = "top",
        axis.title.y = element_text(face = "bold", size = 14),
        axis.title.x = element_text(face = "bold", size = 14, colour = "black"),
        legend.title = element_text(size = 14, colour = "black", face = "bold"),
        panel.background = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA, linewidth = 1.2),
        legend.key = element_blank())
# plot the proportion of occurrence classes for each conservation status
b1 <- ggplot(df5, aes(x = regionalGovermentConservationStatus, fill = occurrence)) +
  geom_bar(aes(y = after_stat(count / sum(count)))) +
  scale_y_continuous(labels = scales::percent) +
  labs(y = "Proportion of counts", x = "Conservation Status", fill = "count")
b1 + scale_fill_manual(values = c("peachpuff", "steelblue", "#CE5832")) +
  theme(axis.text.y = element_text(colour = "black", size = 12, face = "bold"),
        axis.text.x = element_text(colour = "black", face = "bold", size = 12),
        legend.text = element_text(size = 12, face = "bold", colour = "black"),
        legend.position = "top",
        axis.title.y = element_text(face = "bold", size = 12),
        axis.title.x = element_text(face = "bold", size = 12, colour = "black"),
        legend.title = element_text(size = 12, colour = "black", face = "bold"),
        panel.background = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA, linewidth = 1.2),
        legend.key = element_blank())
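The bar heights in this plot are shares of all records, since sum(count) in after_stat() is the total across every bar segment. A minimal sketch of the same calculation as a table, on made-up data, may help when exact values are needed. The data frame `toy` and the column names `n_records` and `proportion` are invented for illustration.

```r
library(dplyr)

# Toy data (made up for illustration): one row per occurrence record
toy <- data.frame(
  status     = c("Endangered", "Endangered", "Least concern", "Least concern", "N/A"),
  occurrence = c("High", "High", "Low", "Medium", "Low")
)

# Share of all records in each status/occurrence combination,
# i.e. the tabular counterpart of after_stat(count / sum(count))
toy_props <- toy %>%
  count(status, occurrence, name = "n_records") %>%
  mutate(proportion = n_records / sum(n_records))
toy_props
```

To express each class as a share within its own conservation status instead, add group_by(status) before the mutate() step.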
From the above plot we can see that the dataset contains high counts for ‘Endangered wildlife’, whereas for ‘Least concern wildlife’ the medium and high counts are more or less proportional, and for ‘Special least concern’ the counts are really low. This is probably accounted for by the species Casuarius casuarius johnsonii (endangered) and Monarcha melanopsis (special least concern).